Collection of Internet

Internet Message Format | 1992-11-30 | 3KB

Return-Path: <connolly@pixel.convex.com>
Received: from dxmint.cern.ch by nxoc01.cern.ch (NeXT-1.0 (From Sendmail 5.52)/NeXT-2.0)
	id AA02064; Wed, 15 Jul 92 00:26:56 MET DST
Received: by dxmint.cern.ch (dxcern) (5.57/3.14)
	id AA13260; Wed, 15 Jul 92 00:26:28 +0200
Received: from pixel.convex.com by convex.convex.com (5.64/1.35)
	id AA18318; Tue, 14 Jul 92 17:25:58 -0500
Received: from localhost by pixel.convex.com (5.64/1.28)
	id AA07409; Tue, 14 Jul 92 17:25:57 -0500
Message-Id: <9207142225.AA07409@pixel.convex.com>
To: timbl@nxoc01.cern.ch (Tim Berners-Lee)
Cc: www-talk@nxoc01.cern.ch
Subject: Re: rethinking the HTML DTD.
In-Reply-To: Your message of "Wed, 15 Jul 92 00:03:56 +0700."
	<9207142203.AA02008@nxoc01.cern.ch>
Date: Tue, 14 Jul 92 17:25:56 CDT
From: Dan Connolly <connolly@pixel.convex.com>

Ok, so we really do want to use SGML. Good. I agree. I just wanted
to hear from the WWW community.

>
>You say HTML is not SGML. It is true that the HTML generated by the NeXT editor
>is not good. (example, lack of quotes around attributes which need them.)
>However, the current parser will parse real SGML.
>

The biggest problem with HTML files is that they have only 1 of the
3 basic parts of an SGML document: the SGML declaration, the prologue,
and the instance. HTML documents only have the instance.

It's legal to omit the SGML declaration -- there's a default. But
you've got to have a prologue, or you end up with a non-standard way
of inferring the prologue (for example, every WWW client infers the
DTD described in "http://info.cern.ch/hypertext/WWW/MarkUp/Tags.html".)

So if we're committed to SGML, let's start putting something like

	<!DOCTYPE HTML SYSTEM "http://info.cern.ch/hypertext/WWW/MarkUp/html.dtd">

at the front of every HTML file (we don't have to store it in the
file -- servers that distribute HTML could generate it on the fly.)
And let's put _some_ kind of DTD there.
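To make the "generate it on the fly" idea concrete, here is a minimal
sketch in Python of what a server might do just before sending a
document. It is only an illustration under stated assumptions: the names
with_prologue, DOCTYPE and _HAS_DOCTYPE are hypothetical, the check for
an existing declaration is a simple pattern match rather than real SGML
parsing (it would miss a declaration preceded by comments, for example),
and nothing is assumed to exist yet at the DTD address proposed above.

```python
import re

# The declaration proposed in the message. The DTD address is the
# proposal's, not something assumed to be published there already.
DOCTYPE = '<!DOCTYPE HTML SYSTEM "http://info.cern.ch/hypertext/WWW/MarkUp/html.dtd">'

# Hypothetical helper pattern: a document type declaration at the very
# start of the text, allowing leading whitespace. Matching is
# case-insensitive for leniency (an assumption of this sketch).
_HAS_DOCTYPE = re.compile(r"\s*<!DOCTYPE\b", re.IGNORECASE)


def with_prologue(instance: str) -> str:
    """Return the document with a DOCTYPE declaration in front of it.

    A document that already starts with a declaration is returned
    unchanged, so stored files may or may not carry their own prologue.
    """
    if _HAS_DOCTYPE.match(instance):
        return instance
    return DOCTYPE + "\n" + instance


if __name__ == "__main__":
    # A bare instance, as HTML files are stored today.
    print(with_prologue("<TITLE>Example</TITLE>\n<P>Some text."))
```

Leaving already-declared documents untouched keeps the stored files
free to adopt the declaration gradually, without the server ever
emitting it twice.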

>In the future, the web will include more complex DTDs, and dynamically
>loaded DTDs, and people will want to use the same parser for it.
>

Interesting! There are plans to support more than one DTD! This makes
SGML a clear winner.

>So I feel RTF would be a backward step. It is true that the current
>W3 software is at a point level with RTF rather than general SGML.
>But why tie ourselves to that point?
>

I guess that's what I wanted to hear: that the goals of WWW and the
features of SGML really _do_ have a lot in common, but the current
implementation doesn't support many of them.

Just to make sure I've beat this horse to death: let's begin to
formalize HTML and validate existing HTML documents before the
distance between HTML and SGML gets too big.

Dan

p.s. I'm working on a DTD that reflects the structure of most existing
word-processor documents: a sequence of paragraphs (maybe broken into
flows, sections, or whatever). I'll have RTF and MIF translators for
the DTD when it's ready. Maybe HTML2 can use some of the features --
the low level character-set related features, anyway.